Goto

Collaborating Authors

 Bristol


Learning to Reason Over Time: Timeline Self-Reflection for Improved Temporal Reasoning in Language Models

Bazaga, Adrián, Blloshmi, Rexhina, Byrne, Bill, de Gispert, Adrià

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have emerged as powerful tools for generating coherent text, understanding context, and performing reasoning tasks. However, they struggle with temporal reasoning, which requires processing time-related information such as event sequencing, durations, and inter-temporal relationships. These capabilities are critical for applications including question answering, scheduling, and historical analysis. In this paper, we introduce TISER, a novel framework that enhances the temporal reasoning abilities of LLMs through a multi-stage process that combines timeline construction with iterative self-reflection. Our approach leverages test-time scaling to extend the length of reasoning traces, enabling models to capture complex temporal dependencies more effectively. This strategy not only boosts reasoning accuracy but also improves the traceability of the inference process. Experimental results demonstrate state-of-the-art performance across multiple benchmarks, including out-of-distribution test sets, and reveal that TISER enables smaller open-source models to surpass larger closed-weight models on challenging temporal reasoning tasks.


PA-RAG: RAG Alignment via Multi-Perspective Preference Optimization

Wu, Jiayi, Cai, Hengyi, Yan, Lingyong, Sun, Hao, Li, Xiang, Wang, Shuaiqiang, Yin, Dawei, Gao, Ming

arXiv.org Artificial Intelligence

The emergence of Retrieval-augmented generation (RAG) has alleviated the issues of outdated and hallucinatory content in the generation of large language models (LLMs), yet it still reveals numerous limitations. When a general-purpose LLM serves as the RAG generator, it often suffers from inadequate response informativeness, response robustness, and citation quality. Past approaches to tackle these limitations, either by incorporating additional steps beyond generating responses or optimizing the generator through supervised fine-tuning (SFT), still failed to align with the RAG requirement thoroughly. Consequently, optimizing the RAG generator from multiple preference perspectives while maintaining its end-to-end LLM form remains a challenge. To bridge this gap, we propose Multiple Perspective Preference Alignment for Retrieval-Augmented Generation (PA-RAG), a method for optimizing the generator of RAG systems to align with RAG requirements comprehensively. Specifically, we construct high-quality instruction fine-tuning data and multi-perspective preference data by sampling varied quality responses from the generator across different prompt documents quality scenarios. Subsequently, we optimize the generator using SFT and Direct Preference Optimization (DPO). Extensive experiments conducted on four question-answer datasets across three LLMs demonstrate that PA-RAG can significantly enhance the performance of RAG generators. Our code and datasets are available at https://github.com/wujwyi/PA-RAG.


A quantitative and typological study of Early Slavic participle clauses and their competition

Pedrazzini, Nilo

arXiv.org Artificial Intelligence

This thesis is a corpus-based, quantitative, and typological analysis of the functions of Early Slavic participle constructions and their finite competitors ($jegda$-'when'-clauses). The first part leverages detailed linguistic annotation on Early Slavic corpora at the morphosyntactic, dependency, information-structural, and lexical levels to obtain indirect evidence for different potential functions of participle clauses and their main finite competitor and understand the roles of compositionality and default discourse reasoning as explanations for the distribution of participle constructions and $jegda$-clauses in the corpus. The second part uses massively parallel data to analyze typological variation in how languages express the semantic space of English $when$, whose scope encompasses that of Early Slavic participle constructions and $jegda$-clauses. Probabilistic semantic maps are generated and statistical methods (including Kriging, Gaussian Mixture Modelling, precision and recall analysis) are used to induce cross-linguistically salient dimensions from the parallel corpus and to study conceptual variation within the semantic space of the hypothetical concept WHEN.


Large Language Models Can Learn Temporal Reasoning

Xiong, Siheng, Payani, Ali, Kompella, Ramana, Fekri, Faramarz

arXiv.org Artificial Intelligence

Large language models (LLMs) learn temporal concepts from the co-occurrence of related tokens in a sequence. Compared with conventional text generation, temporal reasoning, which reaches a conclusion based on mathematical, logical and commonsense knowledge, is more challenging. In this paper, we propose TempGraph-LLM, a new paradigm towards text-based temporal reasoning. To be specific, we first teach LLMs to translate the context into a temporal graph. A synthetic dataset, which is fully controllable and requires minimal supervision, is constructed for pre-training on this task. We prove in experiments that LLMs benefit from the pre-training on other tasks. On top of that, we guide LLMs to perform symbolic reasoning with the strategies of Chain of Thoughts (CoTs) bootstrapping and special data augmentation. We observe that CoTs with symbolic reasoning bring more consistent and reliable results than those using free text.


Remote Computer Vision Engineer openings in New York on August 23, 2022 – Data Science Jobs

#artificialintelligence

Role requiring'No experience data provided' months of experience in San Francisco We are a startup within an enterpise business and have huge growth plans! Our product relies on Computer Vision to make it easier for customers to choose between different product offerings. We are headquartered in the Bay Area but have engineers throughout the country! With a remote-first culture, we strongly believe in collobaration via Microsoft Teams. Our software is used daily by millions of customers globally and we are still gaining new customers, we have exciting plans for the future!


Deep Artificial Intelligence for Fantasy Football Language Understanding

Baughman, Aaron, Forester, Micah, Powell, Jeff, Morales, Eduardo, McPartlin, Shaun, Bohm, Daniel

arXiv.org Artificial Intelligence

Fantasy sports allow fans to manage a team of their favorite athletes and compete with friends. The fantasy platform aligns the real-world statistical performance of athletes to fantasy scoring and has steadily risen in popularity to an estimated 9.1 million players per month with 4.4 billion player card views on the ESPN Fantasy Football platform from 2018-2019. In parallel, the sports media community produces news stories, blogs, forum posts, tweets, videos, podcasts and opinion pieces that are both within and outside the context of fantasy sports. However, human fantasy football players can only analyze an average of 3.9 sources of information. Our work discusses the results of a machine learning pipeline to manage an ESPN Fantasy Football team. The use of trained statistical entity detectors and document2vector models applied to over 100,000 news sources and 2.3 million articles, videos and podcasts each day enables the system to comprehend natural language with an analogy test accuracy of 100% and keyword test accuracy of 80%. Deep learning feedforward neural networks provide player classifications such as if a player will be a bust, boom, play with a hidden injury or play meaningful touches with a cumulative 72% accuracy. Finally, a multiple regression ensemble uses the deep learning output and ESPN projection data to provide a point projection for each of the top 500+ fantasy football players in 2018. The point projection maintained a RMSE of 6.78 points. The best fit probability density function from a set of 24 is selected to visualize score spreads. Within the first 6 weeks of the product launch, the total number of users spent a cumulative time of over 4.6 years viewing our AI insights. The training data for our models was provided by a 2015 to 2016 web archive from Webhose, ESPN statistics, and Rotowire injury reports. We used 2017 fantasy football data as a test set.


AI can tell Republicans from Democrats – but can you? Take our quiz

The Guardian

Researchers say artificial intelligence will soon be able to detect a person's political allegiance – just by looking at photos of their face. We've put together a quiz to see if you can beat the algorithms and work out, from someone's face, their political allegiance. We've chosen 15 pictures of city councillors from Bristol, Connecticut and San Diego – eight Democrat, seven Republican. Can you figure out which is which?


A Wiki for Business Rules in Open Vocabulary, Executable English

Walker, Adrian

arXiv.org Artificial Intelligence

The problem of business-IT alignment is of widespread economic concern. As one way of addressing the problem, this paper describes an online system that functions as a kind of Wiki -- one that supports the collaborative writing and running of business and scientific applications, as rules in open vocabulary, executable English, using a browser. Since the rules are in English, they are indexed by Google and other search engines. This is useful when looking for rules for a task that one has in mind. The design of the system integrates the semantics of data, with a semantics of an inference method, and also with the meanings of English sentences. As such, the system has functionality that may be useful for the Rules, Logic, Proof and Trust requirements of the Semantic Web. The system accepts rules, and small numbers of facts, typed or copy-pasted directly into a browser. One can then run the rules, again using a browser. For larger amounts of data, the system uses information in the rules to automatically generate and run SQL over networked databases. From a few highly declarative rules, the system typically generates SQL that would be too complicated to write reliably by hand. However, the system can explain its results in step-by-step hypertexted English, at the business or scientific level As befits a Wiki, shared use of the system is free.